draft-ietf-avt-video-packet-01.txt
Internet Engineering Task Force          Audio-Video Transport WG
INTERNET-DRAFT                            T. Turletti, C. Huitema
                                                            INRIA
                                                           May 93
                                                 Expires: Nov. 93
Packetization of H.261 video streams
May 28, 1993
Thierry Turletti, Christian Huitema
INRIA
Christian.Huitema@sophia.inria.fr
Thierry.Turletti@sophia.inria.fr
1. Status of this Memo
This document is an Internet draft. Internet drafts are
working documents of the Internet Engineering Task Force
(IETF), its Areas, and its Working Groups. Note that other
groups may also distribute working documents as Internet
Drafts.
Internet Drafts are draft documents valid for a maximum of six
months. Internet Drafts may be updated, replaced, or obsoleted
by other documents at any time. It is not appropriate to use
Internet Drafts as reference material or to cite them other
than as a "working draft" or "work in progress".
Please check the I-D abstract listing contained in each
Internet Draft directory to learn the current status of this
or any other Internet Draft.
Distribution of this document is unlimited.
INTERNET-DRAFT Packetization of H.261
2. Abstract
This draft describes a packetization scheme for H.261 video
streams. The proposed scheme can be used to transport such a
video flow over the protocols used by RTP.
This specification is a product of the Audio-Video Transport
working group within the Internet Engineering Task Force.
Comments are solicited and should be addressed to the
working group's mailing list at rem-conf@es.net and/or the
authors.
Turletti, Huitema [Page 2]
3. Purpose of this document
The CCITT recommendation H.261 [1] specifies the encodings
used by CCITT compliant video-conference codecs. Although
these encodings were originally specified for fixed data rate
ISDN circuits, experiments [2] have shown that they can
also be used over the Internet.
The purpose of this memo is to specify how H.261 video streams
can be carried over the protocols used by RTP [3], such as
UDP, ST-II, etc.
4. Structure of the packet stream
H.261 codecs produce a bit stream. In fact, H.261 and
companion recommendations specify several levels of encoding:
(1) Images are first separated into blocks of 8x8 pixels.
Blocks which have moved are encoded by computing a
discrete cosine transform (DCT), whose coefficients are
then quantized and Huffman encoded.
(2) The bits resulting from the Huffman encoding are then
arranged in 512-bit frames, containing 2 bits of
synchronization, 492 bits of data and 18 bits of error
correcting code.
(3) The 512-bit frames are then interlaced with an audio
stream and transmitted over p x 64 kbps circuits according
to specification H.221.
When transmitting over the Internet, we will directly consider
the output of the Huffman encoding. We will not carry the
512-bit frames, as protection against errors can be obtained by
other means. Similarly, we will not attempt to multiplex audio
and video signals in the same packets, as UDP and RTP provide
a much more efficient way to achieve multiplexing.
Directly transmitting the result of the Huffman encoding over
an unreliable stream of UDP datagrams would however have very
poor error resistance characteristics. The H.261 coding is in
fact organized as a sequence of images, or frames, which are
themselves organized as a set of Groups of Blocks (GOB). Each
GOB holds a set of 3 lines of 11 macroblocks (MB). Each MB
carries information on a group of 16x16 pixels: luminance
information is specified for 4 blocks of 8x8 pixels, while
chrominance information is given by only two 8x8 blocks, the
"red" and "blue" color difference components. These components
and the codes representing their sampled values are as defined
in the CCIR Recommendation 601.
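The block hierarchy described above fixes the size of a GOB. As a minimal sketch, assuming the CIF and QCIF image sizes given in section 5, the following C fragment derives the number of GOBs per image. The helper names gobs_per_image() and blocks_per_macroblock() are illustrative only and are not defined by H.261.

```c
#include <assert.h>

/* Sizes taken from the text: a GOB is 3 lines of 11 macroblocks,
   and a macroblock covers 16x16 pixels. */
enum {
    MB_SIZE     = 16,
    GOB_MB_COLS = 11,
    GOB_MB_ROWS = 3,
    GOB_WIDTH   = GOB_MB_COLS * MB_SIZE,   /* 176 pixels */
    GOB_HEIGHT  = GOB_MB_ROWS * MB_SIZE    /* 48 pixels */
};

/* Number of GOBs needed to tile an image of the given size.
   Hypothetical helper: the image size must be a multiple of
   the GOB size. */
int gobs_per_image(int width, int height)
{
    assert(width % GOB_WIDTH == 0 && height % GOB_HEIGHT == 0);
    return (width / GOB_WIDTH) * (height / GOB_HEIGHT);
}

/* Number of 8x8 blocks carried by one macroblock: 4 luminance
   blocks plus the "red" and "blue" color difference blocks. */
int blocks_per_macroblock(void)
{
    return 4 + 2;
}
```

With these sizes a CIF image (352 x 288) is tiled by 12 GOBs and a QCIF image (176 x 144) by 3 GOBs.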
This grouping is used to specify information at each level of
the hierarchy:
- At the frame level, one specifies information such as
the delay from the previous frame, the image format, and
various indicators.
- At the GOB level, one specifies the GOB number and the
default quantizer that will be used for the MBs.
- At the MB level, one specifies which blocks are present
and which did not change, and optionally a quantizer, as
well as details of the coding such as movement vectors.
The result of this structure is that one needs to receive the
information present in the frame header to decode the GOBs,
as well as the information present in the GOB header to
decode the MBs. Without precautions, this would mean that one
has to receive all the packets that carry an image in order to
properly decode its components. In fact, experience has
shown that:
(1) It would be unrealistic to carry an image in a single
packet: video images can sometimes be very large.
(2) The information for a GOB typically fits in a packet. In
fact, several GOBs can often be grouped in a packet.
Once we have taken the decision to correlate GOB
synchronization and packetization, a number of decisions
remain to be taken, subject to the following conditions:
(1) The algorithm should be easy to implement when
packetizing the output stream of a hardware codec.
(2) The algorithm should not induce rendition delays -- we
should not have to wait for a following packet to display
an image.
(3) The algorithm should allow for efficient
resynchronization in case of packet losses.
(4) It should be easy to depacketize the data stream and
direct it to a hardware codec's input.
(5) When the hardware decoder operates at a fixed bit rate,
one should be able to maintain synchronization, e.g. by
adding padding bits when the packet arrival rate is
slower than the bit rate.
The H.261 Huffman encoding includes a special "GOB start"
pattern, composed of 15 zeroes followed by a single 1, that
cannot be imitated by any other code words. That pattern marks
the separation between two GOBs, and is in fact used as an
indicator that the current GOB is terminated. The encoding
also includes a stuffing pattern, composed of seven zeroes
followed by four ones; that stuffing pattern can only be
inserted between the encoding of MBs, or just before the GOB
separator.
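Since the "GOB start" pattern may begin at any bit position, a packetizer has to search for it bit by bit. The following is a minimal sketch of such a search, assuming that bits are read starting from the most significant bit of each octet; find_gob_start() is a hypothetical helper, simpler but slower than the table-driven procedure of Appendix A.

```c
#include <assert.h>
#include <stddef.h>

/* Return the bit offset just past the first GOB start pattern
   (at least 15 zero bits followed by a one) found in buf, or -1
   if the pattern does not occur. Bit offset 0 is the most
   significant bit of buf[0]. */
long find_gob_start(const unsigned char *buf, size_t len)
{
    int zeros = 0;      /* length of the current run of zero bits */
    size_t i;
    int b;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        for (b = 7; b >= 0; b--) {
            if ((buf[i] >> b) & 1) {
                if (zeros >= 15)
                    return (long)(i * 8 + (size_t)(7 - b) + 1);
                zeros = 0;
            } else {
                zeros++;
            }
        }
    }
    return -1;
}
```

For the octets 00 01 the pattern ends on an octet boundary and the function returns 16; for F0 00 10 it ends inside the third octet and the function returns 20.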
The first conclusion of the analysis is that the packets
should contain all the GOB data, including the "GOB start"
pattern that separates the current GOB from the following one.
Actually, as this pattern is well known, we can just as well
use a single bit in the data header to indicate that a
GOB-start pattern must be added at the decoder side.
Not encoding the GOB-start pattern has two advantages:
(1) It reduces the number of bits in the packets, and avoids
the possibility of splitting packets in the middle of a
GOB separator.
(2) It allows gateways to hardware decoders to insert the
stuffing pattern in front of the GOB, in order to meet
the fixed bit rate requirement.
Another problem posed by the specifics of H.261
compression is that the GOB data have no particular reason to
fit in an integer number of octets. The data header will thus
contain two three-bit integers, SBIT and EBIT:
SBIT indicates the number of bits that should be ignored in
the first (start) data octet.
EBIT indicates the number of bits that should be ignored in
the last (end) data octet.
Although only the EBIT counter would really be needed for
software coders, the SBIT counter was inserted to ease the
packetization of hardware coder output. A sample
packetization procedure is found in Appendix A.
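The effect of SBIT and EBIT on the receiving side can be sketched as follows. This is an illustration only: payload_bits() and payload_bit() are hypothetical helpers, and the sketch assumes, as the Appendix A procedure suggests, that the ignored bits are the most significant bits of the first octet and the least significant bits of the last octet.

```c
#include <assert.h>
#include <stddef.h>

/* Number of significant bits in a packet carrying len data octets. */
long payload_bits(size_t len, int sbit, int ebit)
{
    long total;

    assert(sbit >= 0 && sbit <= 7 && ebit >= 0 && ebit <= 7);
    total = (long)len * 8 - sbit - ebit;
    return total > 0 ? total : 0;
}

/* Return the n-th significant bit (counting from 0) of the data,
   skipping the SBIT bits ignored in the first octet. The caller
   must keep n below payload_bits(). */
int payload_bit(const unsigned char *data, int sbit, long n)
{
    long pos = sbit + n;

    assert(data != NULL && n >= 0);
    return (data[pos / 8] >> (7 - pos % 8)) & 1;
}
```

For example, three data octets with SBIT = 4 and EBIT = 7 carry 3 x 8 - 4 - 7 = 13 significant bits.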
At the receiving sites, the GOB synchronization can be used in
conjunction with the synchronization service of the RTP
protocol. In case of losses, the decoders could become
desynchronized. The "S" bit of the H.261 option field will be
set to indicate that the packet includes the beginning of the
encoding of a GOB, i.e. the quantizer common to all
macroblocks. The receiver will detect losses by looking at the
RTP sequence numbers. The receiver may either resequence
out-of-order packets or merely drop them. In case of losses, it
will ignore all packets whose "S" bit is zero. Once a packet
with the "S" bit set has been received, it will prepend the GOB
start code to that packet, and resume decoding.
An example packetization program is given in Appendix A.
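The receiver behaviour described in this section can be sketched as a small state machine. The sketch below is an assumption-laden illustration, not part of the specification: the structure h261_rx and the function h261_rx_accept() are hypothetical, the sequence number is assumed to be 16 bits wide, and out-of-order packets are simply treated as losses rather than resequenced.

```c
#include <assert.h>
#include <stddef.h>

/* Receiver state for one flow (hypothetical structure). A new
   receiver should start with waiting_for_s set to 1, since it
   cannot decode before the beginning of a GOB. */
struct h261_rx {
    int have_seq;             /* nonzero once a packet was seen  */
    unsigned short next_seq;  /* next expected sequence number
                                 (16-bit width is an assumption) */
    int waiting_for_s;        /* nonzero while resynchronizing   */
};

/* Decide what to do with an incoming packet. Returns 0 if the
   packet must be ignored, 1 if it can be passed to the decoder
   as is, and 2 if a GOB start code must be prepended first. */
int h261_rx_accept(struct h261_rx *rx, unsigned short seq, int s_bit)
{
    assert(rx != NULL);
    if (rx->have_seq && seq != rx->next_seq)
        rx->waiting_for_s = 1;          /* loss detected */
    rx->have_seq = 1;
    rx->next_seq = (unsigned short)(seq + 1);
    if (rx->waiting_for_s) {
        if (!s_bit)
            return 0;                   /* still desynchronized */
        rx->waiting_for_s = 0;          /* resume at this GOB */
    }
    return s_bit ? 2 : 1;
}
```

A receiver that prefers to resequence out-of-order packets would buffer them before calling such a function.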
5. Usage of RTP
The H.261 data is carried as payload within the RTP
protocol, using the following RTP fields:
 ____________________________________________
| Ver       | Protocol version (1).          |
|___________|________________________________|
| Flow      | Identifies one particular      |
|           | video stream.                  |
|___________|________________________________|
| Content   | H.261 encoded video (31).      |
|___________|________________________________|
| Sequence  | Identifies the packet within   |
| number    | a stream                       |
|___________|________________________________|
| Sync      | Set if the packet              |
|           | includes the end of an image.  |
|___________|________________________________|
| Timestamp | The date at which the          |
|           | image was grabbed.             |
|___________|________________________________|
The very definition of these settings implies that the
beginning of an image shall always be synchronized with a
packet. The RTP sequence number can be used to detect missing
packets. In this case, one shall ignore all incoming packets
until the next synchronization mark is received. The "Sync"
bit can be used as a flag to trigger the display of the new
image on the screen. The H.261 data will follow the RTP
options, as in:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.                                                               .
.              RTP header + RTP options (optional)              .
.                                                               .
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         H.261 options         |        H.261 stream...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The H.261 options field is defined as follows:
 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|S|SBIT |E|EBIT |C|I|V|MBZ| FMT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 ______________________________________________________
| S (1 bit)     | Start of GOB. Set if                 |
|               | the packet is a start of GOB.        |
|_______________|______________________________________|
| SBIT (3 bits) | Start bit position:                  |
|               | number of bits that should           |
|               | be ignored in the first              |
|               | (start) data octet.                  |
|_______________|______________________________________|
| E (1 bit)     | End of GOB. Set if                   |
|               | the packet is an end of GOB.         |
|_______________|______________________________________|
| EBIT (3 bits) | End bit position:                    |
|               | number of bits that should           |
|               | be ignored in the last               |
|               | (end) data octet.                    |
|_______________|______________________________________|
| C (1 bit)     | Color flag. Set if                   |
|               | color is encoded.                    |
|_______________|______________________________________|
| I (1 bit)     | Full Intra Image flag.               |
|               | Set if it is the first packet        |
|               | of a full intra image.               |
|_______________|______________________________________|
| V (1 bit)     | Movement vector flag.                |
|               | Set if movement vectors              |
|               | are encoded.                         |
|_______________|______________________________________|
| FMT (3 bits)  | Image format:                        |
|               | QCIF, CIF or number of CIF in SCIF.  |
|_______________|______________________________________|
| MBZ (2 bits)  | Must be zero.                        |
|_______________|______________________________________|
The image format (3 bits) is defined as follows:
 ___________________________
| QCIF               |  000 |
|____________________|______|
| CIF                |  001 |
|____________________|______|
| SCIF 0             |      |
| upper left corner  |  100 |
| CIF in SCIF image  |      |
|____________________|______|
| SCIF 1             |      |
| upper right corner |  101 |
| CIF in SCIF image  |      |
|____________________|______|
| SCIF 2             |      |
| lower left corner  |  110 |
| CIF in SCIF image  |      |
|____________________|______|
| SCIF 3             |      |
| lower right corner |  111 |
| CIF in SCIF image  |      |
|____________________|______|
With:
- CIF: Common interchange format for video images with 352
x 288 pixels.
- QCIF: Quarter CIF with 176 x 144 pixels.
- SCIF: Super CIF with 704 x 576 pixels (four CIF images).
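The 16-bit H.261 options field defined in this section can be packed and unpacked with a few shifts and masks. The following is a minimal sketch, assuming that bit 0 of the diagram is the most significant bit of the 16-bit value; the structure h261_options and both functions are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Image format codes, as listed in the format table. */
enum h261_fmt {
    FMT_QCIF = 0, FMT_CIF = 1,
    FMT_SCIF0 = 4, FMT_SCIF1 = 5, FMT_SCIF2 = 6, FMT_SCIF3 = 7
};

/* One member per field of the options word (MBZ is implicit). */
struct h261_options {
    unsigned s, sbit, e, ebit, c, i, v, fmt;
};

/* Pack the fields into a 16-bit value. The two MBZ bits
   (bits 11 and 12 of the diagram) are left at zero. */
unsigned h261_pack(const struct h261_options *o)
{
    assert(o->sbit < 8 && o->ebit < 8 && o->fmt < 8);
    return ((o->s & 1u) << 15) | (o->sbit << 12) |
           ((o->e & 1u) << 11) | (o->ebit << 8) |
           ((o->c & 1u) << 7)  | ((o->i & 1u) << 6) |
           ((o->v & 1u) << 5)  | o->fmt;
}

/* Extract the fields from a 16-bit options value. */
void h261_unpack(unsigned w, struct h261_options *o)
{
    o->s    = (w >> 15) & 1u;
    o->sbit = (w >> 12) & 7u;
    o->e    = (w >> 11) & 1u;
    o->ebit = (w >> 8) & 7u;
    o->c    = (w >> 7) & 1u;
    o->i    = (w >> 6) & 1u;
    o->v    = (w >> 5) & 1u;
    o->fmt  = w & 7u;
}
```

On the wire the 16-bit value would be sent most significant octet first, following the usual Internet convention.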
6. Usage of RTP parameters
When sending or receiving H.261 streams through the RTP
protocol, the stations should be ready to:
(1) process or ignore all generic RTP parameters,
(2) send or receive H.261 specific "Reverse Application Data"
parameters, to request a video resynchronization.
This memo describes two "RAD" item types, "Full Intra Request"
and "Negative Acknowledge".
6.1. Controlling the reverse flow
Support of the reverse application data by the H.261 sender is
optional; in particular, early experiments have shown that the
usage of this feature could have very negative effects when
the number of recipients is very large.
Recipients learn the return address where RAD information may
be sent from the Content description (CDESC) item, which may
be included as an RTP option in any of the video packets. The
CDESC item includes a Return port number value. A value of
zero indicates that no reverse control information should be
returned.
A recipient shall never send a RAD item if it has not yet
received a CDESC item from the source, or if the port number
received in the last CDESC item was zero.
Emitters should identify themselves by sending CDESC items at
regular intervals.
6.2. Full Intra Request
The "Full Intra Request" items are identified by the item Type
"FIR" (0).
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F| RAD | length = 1 | Type | Z | Flow |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
These packets indicate that a recipient has lost all video
synchronization, and request the source to send the next image
in "Intra" coding mode, i.e. without using differential
coding. The various fields are defined as follows:
 _______________________________________________
| F      | Last option bit, as defined by RTP.  |
|________|______________________________________|
| RAD    | RAD option type (65)                 |
|________|______________________________________|
| Length | In 32-bit words.                     |
|________|______________________________________|
| Type   | FIR (0).                             |
|________|______________________________________|
| Z      | Must be zero                         |
|________|______________________________________|
| Flow   | The flow id of the incoming packets  |
|________|______________________________________|
6.3. Negative Acknowledgements
Packet losses are detected using the RTP sequence number.
After a packet loss, the receiver will resynchronize on the
next GOB. However, as H.261 uses differential encoding, parts
of the images may remain blurred for a rather long time.
As all GOBs belonging to a given video image carry the same
time stamp, the receiver can determine the list of GOBs which
were actually received for that time stamp, and thus identify
the "missing blocks". Requesting a specific reinitialization
of these missing blocks is more efficient than requesting a
complete reinitialization of the image through the "Full Intra
Request" item.
The format of the video-nack option is as follows:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F| RAD | length = 3 | Type | Z | Flow |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| FGOBL | LGOBL | MBZ | FMT |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp (seconds) | timestamp (fraction) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The different fields have the following values:
 ____________________________________________________
| F         | Last option bit, as defined by RTP.    |
|___________|________________________________________|
| RAD       | RAD option type (65)                   |
|___________|________________________________________|
| Length    | Three 32-bit words.                    |
|___________|________________________________________|
| Type      | NACK (1).                              |
|___________|________________________________________|
| Z         | Must be zero                           |
|___________|________________________________________|
| Flow      | The flow id of the incoming packets    |
|___________|________________________________________|
| FGOBL     | First GOB Lost:                        |
|           | The number of the first lost GOB.      |
|___________|________________________________________|
| LGOBL     | Last GOB Lost:                         |
|           | The number of the last lost GOB.       |
|___________|________________________________________|
| MBZ       | Must be zero                           |
|___________|________________________________________|
| FMT       | Repeat the format indicator of the     |
|           | received image, including the number   |
|           | of the SCIF subimage if present.       |
|___________|________________________________________|
| Timestamp | The RTP timestamp of the               |
|           | original image                         |
|___________|________________________________________|
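The FGOBL and LGOBL values can be derived from the list of GOBs received for a given time stamp. The sketch below is only an illustration: missing_gob_range() is a hypothetical helper, and it assumes that the GOBs of an image are numbered consecutively from 1 to ngob. Since the option carries a single range, GOBs that were received but lie between two missing GOBs fall inside the requested range.

```c
#include <assert.h>
#include <stddef.h>

/* Find the first and last missing GOB among GOB numbers 1..ngob.
   received[k] is nonzero if GOB number k arrived for the time
   stamp under consideration; index 0 is unused. Returns 1 and
   fills *fgobl and *lgobl if at least one GOB is missing, and
   returns 0 otherwise. */
int missing_gob_range(const int *received, int ngob,
                      int *fgobl, int *lgobl)
{
    int k, found = 0;

    assert(received != NULL && fgobl != NULL && lgobl != NULL);
    assert(ngob >= 1);
    for (k = 1; k <= ngob; k++) {
        if (!received[k]) {
            if (!found)
                *fgobl = k;     /* first gap seen */
            *lgobl = k;         /* last gap seen so far */
            found = 1;
        }
    }
    return found;
}
```

A receiver would send a "Negative Acknowledge" item only when the function reports a missing range and the rules of section 6.1 allow a RAD item to be sent.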
7. References
[1] "Video codec for audiovisual services at p x 64 kbit/s",
CCITT Recommendation H.261, 1990.
[2] Thierry Turletti, "H.261 software codec for
videoconferencing over the Internet", INRIA Research Report
no 1834, January 1993.
[3] Henning Schulzrinne, "A Transport Protocol for Real-Time
Applications", INTERNET-DRAFT, December 15, 1992.
Appendix A
The following code can be used to packetize the output of an
H.261 codec:
#include <stdio.h>
#define BUFFER_MAX 512
int right[] = {
/* Number of successive zero bits starting at the MSB, for
each octet value */
8,7,6,6,5,5,5,5,4,4,4,4,4,4,4,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
int left[] = {
/* Number of successive zero bits starting at the LSB, for
each octet value */
8,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
6,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
7,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
6,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,
5,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0};
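/* send_message() is not defined in this memo; the prototype
   below is inferred from the calls made by h261_sync(). */
extern void send_message(unsigned char *buf, int len, int sbit,
                         int ebit, int end_of_group,
                         int start_of_group);

void h261_sync(FILE *F)
{
    int i, ebit, sbit, start_of_group, end_of_group, c, nz;
    unsigned char buf[BUFFER_MAX];

    i = 0;
    ebit = 0;
    sbit = 0;
    start_of_group = 1;
    nz = 0;     /* length of the current run of zero bits */
    /* Compare against EOF: a zero octet is valid data. */
    while ((c = getc(F)) != EOF) {
        buf[i++] = c;
        if (c == 0) {
            nz += 8;
        } else {
            nz += right[c];
            if (nz >= 15) {
                /* GOB start pattern found. */
                end_of_group = 1;
                if (right[c] == 7) {
                    /* The pattern ends on an octet boundary. */
                    ebit = 0;
                    send_message(buf, i - 2, sbit, ebit,
                                 end_of_group, start_of_group);
                    sbit = 0;
                    i = 0;
                } else {
                    /* The pattern ends inside the current octet,
                       which also starts the next GOB. */
                    ebit = 7 - right[c];
                    send_message(buf, i - 2, sbit, ebit,
                                 end_of_group, start_of_group);
                    i = 0;
                    buf[i++] = c;
                    sbit = right[c] + 1;
                }
                start_of_group = 1;
            }
            /* Restart the count from the trailing zeroes of c. */
            nz = left[c];
        }
        if (i >= BUFFER_MAX) {
            /* Buffer full: send all but the last two octets,
               which may hold the start of a GOB start pattern. */
            ebit = 0;
            end_of_group = 0;
            send_message(buf, i - 2, sbit, ebit,
                         end_of_group, start_of_group);
            buf[0] = buf[i - 2];
            buf[1] = buf[i - 1];
            i = 2;
            sbit = 0;
            start_of_group = 0;
        }
    }
    if (i > 0) {
        /* End of input: send what remains of the last GOB. */
        send_message(buf, i, sbit, 0, 1, start_of_group);
    }
}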
Table of Contents
1 Status of this Memo
2 Abstract
3 Purpose of this document
4 Structure of the packet stream
5 Usage of RTP
6 Usage of RTP parameters
6.1 Controlling the reverse flow
6.2 Full Intra Request
6.3 Negative Acknowledgements
7 References
Appendix A